(54) A method and apparatus for processing compressed video data streams



(57) The present invention relates to the field of digital
broadcasting, and more particularly the insertion of
digital video streams into other digital video streams.

Compressed digital video streams, such as those 
compressed using the common MPEG-2 system, use a 
sequence of frames to compress a video sequence. Part
of the encoding method to compress frames involves
making predictions based on past or future frames.

Where part of a compressed video stream is to be 
inserted into another existing video stream, problems 
may arise at the insertion point due to dependencies on
past or future frames which occur outside of the insertion
point. The effect of this is that the decoding process
lacks information on which to make its predictions, and
this could cause a decoder to reset or display frames
out of order.

The present invention overcomes this problem in a 
way which allows frame accurate insertion to be 
achieved without compromising quality. 

The present invention can be used to pre-process 
a compressed video stream ready for insertion, or can 
be used to dynamically insert a compressed video 
stream into an existing compressed video stream.
Description

[0001] The present invention relates to the field of dig- 
ital broadcasting, and more particularly the insertion of 
digital video streams into other digital video streams. 
[0002] Digital television involves the generation and 
storage of enormous quantities of data. Digital compres- 
sion techniques can be applied to this data to drastically 
reduce the volumes of data required for transmission 
and storage. One compression technique common in 
the field is MPEG-2. MPEG-2 compresses video data 
by removing or reducing redundancy inherent in many
types of image or video sequences. MPEG-2 makes use 
of three different types of frame which enable redundan- 
cy to be minimised. The three types of frames used are 
known as I frames, B frames and P frames. 
[0003] I frames contain information which allows a 
complete frame to be reconstructed from only the data
contained within the I frame. P frames use a single
previously reconstructed frame as the basis for temporal
prediction calculations. P frames base their predictions 
on the nearest I or P frame, and this is known as forward 
prediction. B frames use bi-directionally interpolated 
motion prediction to allow a decoder to rebuild a frame 
that is located between two reconstructed display 
frames. B frames use both past frames and future 
frames to make their predictions and require more than 
two frames of video storage. MPEG-2 video streams are 
made of a sequence of I, P and B frames which describe 
the video sequence. 

[0004] The decompression and display of MPEG-2 
compressed video streams may therefore rely on both 
past and future frames. Given the nature of compressed 
video streams, certain functions such as editing and
insertion of bit-streams become problematic. If a
compressed video sequence is cut at any point in time, it
is likely that the frame immediately prior to the cut is
dependent on information contained in subsequent
frames to complete the decoding process. Also, if a video
sequence is inserted into a video stream, it is
likely that the first frames of the video sequence are
based on previous frames which no longer exist.
[0005] If any frame required by the decoding process, 
such as a previous or future frame, is missing, this will 
lead to temporary breakdown of the decoding process 
until the next I frame is received. This will result in a tem- 
porary reduction of quality of the decoded image. 
[0006] The problem could be avoided if frame accurate
insertion were not required; however, this is not a
suitable solution for the broadcaster.
[0007] A problem therefore arises where frame accurate
editing and insertion of compressed video bit-streams
is required.

[0008] Accordingly, one object of the present inven- 
tion is to provide a method and apparatus to enable 
frame accurate editing and insertion of compressed vid- 
eo streams. 

[0009] According to one aspect of the present invention
there is provided a method of processing a compressed
digital bit-stream including a sequence of temporally
referenced frames, at least some of which are
coded in dependence on information in preceding or
succeeding frames, to allow the bit-stream to be inserted
into another such digital bit-stream, the method
comprising the steps of: identifying the presence of one or
more frames at a given insertion point which are coded
in dependence upon one or more frames beyond the
insertion point; and modifying the sequence so as to
remove any such dependency and maintain continuity of
the temporal references.

[0010] According to a second aspect of the present
invention there is provided apparatus for processing a
compressed digital bit-stream including a sequence of
temporally referenced frames, at least some of which
are coded in dependence on information in preceding
or succeeding frames, to allow the bit-stream to be
inserted into another such digital bit-stream, the apparatus
comprising the steps of: a detector for identifying the
presence of one or more frames at a given insertion
point which are coded in dependence upon one or more
frames beyond the insertion point; and a processor for
modifying the sequence so as to remove any such
dependency and maintain continuity of the temporal
referencing.

[0011] The invention will now be described, by way of
example, with reference to the following diagrams, in
which:

Figure 1 is a diagram showing an overview of the
broadcasting system according to the present invention;
Figure 2 is a diagram showing a typical frame sequence
in display order of MPEG-2 video frames;
Figure 3 is a diagram showing the effect of conventional
B frame re-ordering for transmission;
Figure 4 is a diagram showing a video sequence
where the insertion point falls on an even frame;
Figure 5 is a diagram showing a video sequence
where the insertion point falls on an odd frame;
Figure 6 is a diagram showing a modification of the
video sequence of Figure 5;
Figure 7 is a diagram showing a modification of the
video sequence of Figure 6;
Figure 8 is a diagram showing a modification of the
video sequence of Figure 7;
Figure 9 is a diagram showing the preferred modifications
of a video sequence;
Figure 10 is a diagram showing one embodiment of
the present invention.

[0012] Figure 1 is a diagram showing a broadcast system
according to the present invention. A national
broadcast 100 is encoded by an encoder 101 which
compresses the input data into a compressed or encoded
digital bit-stream 108. A switch 104 is used, in this
example, to insert regional adverts from a database 103
into the national broadcast stream 108.
[0013] A number of other encoders, one of which is
shown at 102, also encode other input signals into
compressed digital bit-streams. The encoded bit-streams
are input to a multiplexer 105, which multiplexes
the individual bit-streams to form a single
multiplexed bit-stream ready for transmission via a
transmission network 106. The transmission network could
include a satellite, cable, microwave, terrestrial or other
broadcasting network. The transmitted bit-stream is
capable of being received by an appropriate decoder, one
of which is shown at 107.

[0014] An MPEG data stream comprises a continuous 
series of coded frames, consisting of I, B and P frames.
As indicated above, many of the frames in the data 
stream are critically dependent upon their predecessors 
or successors due to the usage of both forward and 
backward prediction. There are many ways in which a 
data stream could be encoded and two common formats 
are 'Single B Frame' and 'Double B Frame'. These for- 
mats relate to the arrangement of the different types of 
frames in a frame sequence. A frame sequence is 
grouped into a unit known as a group of pictures, more 
commonly referred to as GOP. The number of frames in 
a GOP, known as the GOP length, varies according to 
the format of encoding employed by an encoder. 
[0015] Single B Frame encoding produces a frame
sequence as follows: "IBPBPBPB...". After an initial I
frame, there follows a sequence of alternate B and P
frames. Single B frame encoding is more commonly
used for the PAL television standard and usually has a
GOP length of 12 frames.

[0016] Double B Frame encoding produces a frame
sequence as follows: "IBBPBBPBB...". After an initial I
frame, there follows a repeating sequence of two B frames
and one P frame. Double B frame encoding is more
commonly used for the NTSC television standard and
usually has a GOP length of 15 frames.
[0017] The type of encoding can be selected at the
encoder according to the requirements of the
broadcaster.
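
The two frame patterns described above can be illustrated with a short sketch. The function name `gop_pattern` and its parameters are hypothetical and are not part of the specification; the sketch only reproduces the "IBPB..." and "IBBP..." layouts for a chosen GOP length.

```python
def gop_pattern(b_frames_per_anchor: int, gop_length: int) -> str:
    """Return the frame types of one GOP as a string such as 'IBPBPB...'.

    b_frames_per_anchor is 1 for Single B Frame and 2 for Double B Frame
    encoding (hypothetical parameter names, for illustration only).
    """
    if gop_length < 1 or b_frames_per_anchor < 0:
        raise ValueError("gop_length must be >= 1 and b_frames_per_anchor >= 0")
    types = ["I"]
    while len(types) < gop_length:
        # Each repeat is a run of B frames followed by one P frame,
        # truncated if the GOP length is reached part-way through.
        for _ in range(b_frames_per_anchor):
            if len(types) < gop_length:
                types.append("B")
        if len(types) < gop_length:
            types.append("P")
    return "".join(types)


print(gop_pattern(1, 12))  # IBPBPBPBPBPB
print(gop_pattern(2, 15))  # IBBPBBPBBPBBPBB
```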

[0018] Referring now to Figure 2, there is shown a typical
frame sequence of single B frame encoded video
frames. The frame sequence comprises a number of
different I frames, P frames and B frames. The letters
shown in the diagrams denote the type of frame: I for an
I frame, P for a P frame and B for a B frame. The
subscripted numerals denote the temporal reference which
indicates the order in which the frames will be displayed
by the decoder. The term 'IN point' is used to denote the
first frame of a video sequence and 'OUT point' is used
to denote the last frame of a sequence to be inserted
into an existing data stream. The sequence of frames
between the 'IN point' and the 'OUT point' is referred to
as a video sequence. This notation is used throughout
this specification.

[0019] Figure 3 shows how a frame sequence is
conventionally re-ordered for transmission. This basically
involves swapping any B frames with the next frame in
the sequence. This ensures that the decoder receives
the frames in the correct order for decoding and making
the necessary predictions.
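
As a minimal sketch of this re-ordering, the code below moves each I or P frame ahead of the B frames that precede it in display order. For single B frame encoding this is the same as swapping each B frame with the next frame. The `Frame` class, the helper names and the example display order (B frames on even temporal references, as the references quoted later in the text suggest) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    kind: str          # "I", "P" or "B"
    temporal_ref: int  # position in display order


def parse(text):
    """Build frames from a compact notation such as 'B0 I1 B2 P3'."""
    return [Frame(tok[0], int(tok[1:])) for tok in text.split()]


def fmt(frames):
    return " ".join(f"{f.kind}{f.temporal_ref}" for f in frames)


def to_transmission_order(display):
    """Send each I/P frame before the B frames that are predicted from it."""
    out, pending_b = [], []
    for frame in display:
        if frame.kind == "B":
            pending_b.append(frame)   # hold back until the next anchor is sent
        else:
            out.append(frame)
            out.extend(pending_b)
            pending_b = []
    out.extend(pending_b)             # trailing B frames with no later anchor
    return out


display = parse("B0 I1 B2 P3 B4 P5")
print(fmt(to_transmission_order(display)))  # I1 B0 P3 B2 P5 B4
```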

[0020] In order to insert a video sequence into an
existing data stream, the video sequence will need
modifying after the 'IN point' and potentially before the 'OUT
point' to ensure that the insertion is seamless or near
seamless, i.e. that the insertion is not apparent to the
viewer.

[0021] Considering the 'IN point', any B frames
immediately following the first I frame and before the first P
frame will potentially reference (or be predicted from) a
previous P frame that no longer exists because it occurs
before the 'IN point' and does not therefore form part of
the current video sequence.

[0022] An 'OUT point' is likely to occur anywhere within
the GOP and probably on a B or P frame. If the sequence
duration is of an odd number of frames (where
the GOP length is even and single B frame encoding is
used) then a problem can occur whereby the last frame
in the sequence has a temporal reference that does not
follow on from any previous temporal reference. Figure
5 is a diagram showing a video sequence in which the
'OUT point' falls on an odd numbered frame. In this
example, the frame sequence is missing the temporal
reference 10, and this causes a discontinuity in the temporal
references within the video sequence. This will most
likely cause the decoder to display frames out of
sequence and/or reset decoding, disrupting the display of
the video sequence. This problem is compounded
where 'double B frames' are employed, as two temporal
references may be missing.
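
The discontinuity described above can be detected by looking for gaps in the temporal references of a truncated sequence. The following is a minimal sketch; the function name and the example reference lists (written in an assumed transmission order) are illustrative and are not taken from the figures.

```python
def missing_temporal_refs(refs):
    """Return temporal references absent from the span min(refs)..max(refs)."""
    present = set(refs)
    if not present:
        return []
    return [r for r in range(min(present), max(present) + 1) if r not in present]


# Temporal references in transmission order (assumed example sequences).
even_length = [1, 0, 3, 2, 5, 4, 7, 6, 9, 8]  # ends on B8: nothing missing
odd_length = even_length + [11]               # ends on P11: B10 was cut off
double_b_cut = [2, 0, 1, 5, 3, 4, 8]          # ends on P8: B6 and B7 cut off

print(missing_temporal_refs(even_length))     # []
print(missing_temporal_refs(odd_length))      # [10]
print(missing_temporal_refs(double_b_cut))    # [6, 7]
```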

[0023] The present invention provides a solution to
these problems and provides a method and apparatus
for seamlessly or near seamlessly inserting bit-streams
into existing bit-streams.

[0024] The process of modifying the bit-streams to
provide seamless or near seamless insertion can be
divided into two separate processes: one processing the
start of the sequence and the second processing the
end of the sequence.

[0025] Immediately after an 'IN point', any frames having
a dependency on other frames occurring before the
'IN point' must be modified to allow the video sequence
to be inserted into an existing data stream. The only
frames which are dependent on previous frames are B
frames, since they are bi-directionally predicted. It
follows that any B frames immediately following the first I
frame and before the next P frame will require modification
and must either be removed or forced to make only
forward predictions.
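
A minimal sketch of the identification step is given below. It assumes the frame types are available in transmission order starting at the 'IN point'; the function name is hypothetical.

```python
def leading_b_indices(kinds):
    """Indices of B frames between the first I frame and the next non-B frame.

    kinds is a sequence of "I", "P" and "B" in transmission order, starting
    at the 'IN point' (an assumed input format for illustration).
    """
    kinds = list(kinds)
    if "I" not in kinds:
        return []
    first_i = kinds.index("I")
    indices = []
    for idx in range(first_i + 1, len(kinds)):
        if kinds[idx] != "B":
            break                     # reached the next P (or I) frame
        indices.append(idx)
    return indices


print(leading_b_indices("IBPBPB"))     # [1]
print(leading_b_indices("IBBPBBPBB"))  # [1, 2]
```

The returned positions are the frames that would need to be removed, replaced, or restricted to forward prediction.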

[0026] One solution is to replace any B frames with
null B frames. A null frame removes any problems
associated with dependency on other frames by causing
the decoder to perform a freeze frame or fade between
the adjacent frames. A null frame in this sense is a frame
in which the macroblocks are not coded.
[0027] A null frame is produced when all elementary
stream syntax associated with the frame is replaced by
zero values, known as stuffing. In addition to the stuffing,
a number of control bytes are also inserted after each
slice header. The following parameters are used in the
first and last macroblock of each slice to produce a null
frame:

Coded_block_pattern, cbp = 0
Motion_vectors = 0
Macroblock header only

[0028] After the first macroblock the
macroblock_address_increment is increased to the last
macroblock - 1, then the last macroblock is coded the
same as the first.
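
One possible reading of this slice layout is sketched below: only the first and last macroblocks of a slice are coded, and the address increment carried by the last macroblock skips over everything in between. The dictionary keys and the figure of 45 macroblocks per slice (a 720 pixel wide picture with 16 pixel macroblocks) are assumptions for illustration, not bit-stream syntax taken from the specification.

```python
def null_slice_macroblocks(mb_per_slice):
    """Describe the two coded macroblocks of a null slice (illustrative only)."""
    if mb_per_slice < 2:
        raise ValueError("a slice needs at least a first and a last macroblock")
    # Header-only macroblock: no coded blocks and zero motion vectors.
    null_mb = {"coded_block_pattern": 0, "motion_vector": (0, 0)}
    return [
        {"address_increment": 1, **null_mb},                 # first macroblock
        {"address_increment": mb_per_slice - 1, **null_mb},  # last macroblock
    ]


slice_mbs = null_slice_macroblocks(45)
print([mb["address_increment"] for mb in slice_mbs])  # [1, 44]
```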

[0029] The concept of null frames can also be applied
to P frames.

[0030] The end of a video sequence will potentially also
need processing to enable seamless or near-seamless
insertion into an existing data stream. The following
examples are based on single B frame encoding having
an even GOP length.

[0031] Figure 4 illustrates a frame sequence having
an even number of frames, the last frame in the sequence
being a B frame with a temporal reference of 8.
As can be seen, the previous frame P9 (a P frame) has
a temporal reference of 9 and will therefore be displayed
by a decoder directly after the display of the B frame B8.
The 'anchor' frames (i.e. the frames from which the B
frame is predicted) for B8 are P7 and P9; therefore this
'OUT point' is complete, since all of the frame dependencies
are within the video sequence, and hence requires
no post-processing.

[0032] Now consider Figure 5, representing a sequence
with a duration comprising an odd number of
frames in which the last frame in the sequence is a P
frame and is shown as P11. The previous frame, B8, has
a temporal reference of 8 and this is preceded by a P
frame P9 with a temporal reference of 9. The frame with
a temporal reference of 10 (the next B frame) B10 is no
longer present in the truncated sequence and is dependent
on a frame which is not part of the video sequence.
This then gives rise to a discontinuity in the temporal
references which is likely to upset the decoding
process, most likely leading to frames being displayed
out of order and/or causing the decoder to reset. Therefore,
this 'OUT point' is not complete and requires
post-processing to allow insertion into another data stream.
[0033] To address the problem associated with odd
length GOPs, one solution is to change the P frame to
a null P frame and change the temporal reference
associated with it from 11 to 10. This is illustrated in Figure
6. This then gives a smooth increment of the temporal
reference through to the new 'OUT point'. Changing the
P frame to a null P frame produces a freeze frame at the
end of the sequence and before the start of a new
sequence. This method has limited success due to the fact
that the new temporal reference and frame type are not
what the decoder expects to see, and this may result in
frames being repeated or displayed out of order.
Furthermore, the changing of the P frame to a null P frame
gives rise to a freeze frame based on the P9 frame, a
frame some 2 frames earlier in display order.
[0034] An improvement on the above solution is
shown in Figure 7. Note now that the last P frame has
been replaced by a null B frame and the temporal reference
is again changed, but this time to 9. Also, the temporal
reference applied to the previous P frame has
been changed from 9 to 10. This improves on the first
scheme by 'pre-warning' the decoder to expect the temporal
references 8 and 9 following the P frame, P10, and
thus results in a continuous increment of the temporal
references.
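
The two modifications described for Figures 6 and 7 can be sketched as operations on the tail of a sequence held in transmission order. This is an illustration of the text only, not the apparatus of the invention; the `Frame` class, the `null` flag and the function names are hypothetical, and the example sequence is an assumed layout ending P9, B8, P11.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    kind: str            # "I", "P" or "B"
    temporal_ref: int
    null: bool = False   # True for a null (uncoded) frame


def parse(text):
    return [Frame(tok[0], int(tok[1:])) for tok in text.split()]


def fmt(frames):
    return " ".join(("null" if f.null else "") + f"{f.kind}{f.temporal_ref}"
                    for f in frames)


def fig6_tail(seq):
    """Figure 6 scheme: final P becomes a null P with its reference reduced by 1."""
    *head, p_last = seq
    if p_last.kind != "P":
        raise ValueError("expected the sequence to end on a P frame")
    return head + [Frame("P", p_last.temporal_ref - 1, null=True)]


def fig7_tail(seq):
    """Figure 7 scheme: previous P takes the next reference, final P becomes a null B."""
    *head, p_prev, b_last, p_last = seq
    if (p_prev.kind, b_last.kind, p_last.kind) != ("P", "B", "P"):
        raise ValueError("expected the sequence to end P, B, P")
    old_ref = p_prev.temporal_ref
    return head + [replace(p_prev, temporal_ref=old_ref + 1),
                   b_last,
                   Frame("B", old_ref, null=True)]


seq = parse("I1 B0 P3 B2 P5 B4 P7 B6 P9 B8 P11")
print(fmt(fig6_tail(seq)[-3:]))  # P9 B8 nullP10
print(fmt(fig7_tail(seq)[-3:]))  # P10 B8 nullB9
```

In both cases the temporal references of the modified sequence run from 0 to 10 without a gap; the schemes differ in which frame carries each reference and which frame is null.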

[0035] A further improvement is shown in Figure 8.
The 'swapping' of the frame syntax (i.e. the swapping of
which frame is changed to a null frame) addresses the
problem of the freeze frame produced in Figure 6.
[0036] Referring now to Figure 9 there is shown the
preferred solution according to the present invention
which produces a video sequence which represents the
best compromise in terms of both the pictures displayed
and the temporal reference ordering for the decoder.
Figure 9a represents a video sequence prior to transmission
with the temporal references in ascending order.
Figure 9b shows the video sequence after frame
reordering has taken place.

[0037] Figure 9c shows that the temporal reference
on P9 (the penultimate P frame in the video sequence)
is incremented to 10. A null B frame is then inserted after
P10. The effect of this is to remove P frame P11 from the
sequence. The temporal references on the final B
frames are then modified to have a continuous
incrementation. Figure 9d shows that the resultant pictures
displayed follow almost completely linearly and include a
fade frame that comprises an interpolation of frames 7
and 9. This solution, when applied as a post-processing
function, has the best chance of being displayed correctly
by the decoder, particularly where the decoder
implementation detail is not known, or the decoder population
is mixed.

[0038] Anyone skilled in the art would appreciate that 
the techniques described above could equally be ap- 
plied, with the relevant modifications, to other formats 
of frame encoding, including double B frames and dif- 
ferent length GOPs. 

[0039] Figure 10 is a diagram showing an overview of
one embodiment of the present invention.
[0040] A compressed video stream is stored in a storage
device 100. The stored video stream, as described
above, may require modifications to enable it to be
inserted into an existing video stream. A controller 101
looks at both the start of the video stream and the end
of the video stream and controls the modifications that
are required to 'repair' any problems with the frame
sequence.

[0041] The video sequence may start on
either an I, P or B frame depending on the position of
the 'IN point'. The controller 101 looks at the start of the
sequence and identifies any B frames which occur between
the first I frame and the next P frame. The
controller causes these B frames to be replaced with null
B frames through a switch 105. The coding of null
frames removes any problems associated with reference
to other frames by causing the decoder to perform
a freeze frame or a fade between two sequences.
[0042] If the controller 101 detects a frame dependency
outside of the current video sequence, it inserts a
null B frame, via a switch 105, after the penultimate P
frame in the sequence. This has the effect of removing
the last P frame from the sequence. The temporal reference
modifier then modifies the temporal references as
shown in Figure 9c. This ensures that the temporal
references form an incremental sequence when displayed
by a decoder. The amended sequence is then stored in
a storage device 104. The result of this operation is that
the compressed video stream can now be seamlessly
or near seamlessly inserted into another video stream
processed in the same way.

[0043] The present invention can be used to process
stored video sequences such that they are ready for
insertion with an existing compressed video stream, as
described above. Alternatively, the present invention
could be used at a video insertion switch, such as the
switch 104 of Figure 1, which would accept unmodified
compressed video streams and would prepare them for
insertion using the method of the present invention in
real-time.
[0044] The present invention has particular application
where regional or national adverts are to be inserted
into an existing compressed video stream. It can equally
be used to create sequences which are to be
continuously looped.



Claims 



1. A method of processing a compressed digital bit-stream
including a sequence of temporally referenced
frames, at least some of which are coded in
dependence on information in preceding or succeeding
frames, to allow the bit-stream to be inserted
into another such digital bit-stream, the method
comprising the steps of:

identifying the presence of one or more frames
at a given insertion point which are coded in
dependence upon one or more frames beyond the
insertion point; and

modifying the sequence so as to remove any
such dependency and maintain continuity of
the temporal references.

2. The method of claim 1, wherein the step of modifying
the sequence includes changing the type of
frames in the sequence.



3. The method of claim 1 or 2, wherein the step of
modifying the sequence includes changing the type of
frames to null frames.

4. The method of claim 1, 2 or 3, wherein the step of
modifying the sequence further includes selectively
modifying the temporal references to ensure that
the frames will be displayed in the correct order by
a decoder.

5. Apparatus for processing a compressed digital bit- 
stream including a sequence of temporally refer- 
enced frames, at least some of which are coded in 
dependence on information in preceding or suc- 
ceeding frames, to allow the bit-stream to be insert- 
ed into another such digital bit-stream, the appara- 
tus comprising the steps of: 

a detector for identifying the presence of one or 
more frames at a given insertion point which are 
coded in dependence upon one or more frames 
beyond the insertion point; and 
a processor for modifying the sequence so as 
to remove any such dependency and maintain 
continuity of the temporal referencing. 

6. The apparatus of claim 5, wherein the processor is
adapted for modifying the sequence by changing the
type of frames within the sequence.

7. The apparatus of claim 6, wherein the processor is 
adapted for modifying the sequence by changing 
the type of frames to null frames. 

8. The apparatus of claim 5, 6 or 7, wherein the proc- 
essor is adapted for selectively modifying the tem- 
poral references to ensure that the frames will be 
displayed in the correct order by a decoder. 



9. A method of transmitting a digital bit-stream proc- 
essed according to the method of any of claims 1 
to 4. 

10. Apparatus for transmitting a digital bit-stream cre- 
ated by the apparatus of any of claims 5 to 8. 